A Binary Competition Tree for Reinforcement Learning
Authors
Abstract
A robust, general, and computationally simple reinforcement learning system is presented. It uses a channel representation which is robust and continuous. The accumulated knowledge is represented as a reward prediction function in the outer product space of the input and output channel vectors. Each computational unit generates an output simply by a vector-matrix multiplication, and the response can therefore be calculated quickly. The response and a prediction of the reward are calculated simultaneously by the same system, which makes TD methods easy to implement if needed. Several units can cooperate to solve more complicated problems. A dynamic tree structure of linear units is grown in order to divide the knowledge space into a sufficient number of regions in which the reward function can be properly described. The tree continuously tests split and prune criteria in order to adapt its size to the complexity of the problem.
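To make the described mechanism concrete, the following is a minimal sketch (not the authors' implementation) of a single linear unit in Python/NumPy: the accumulated knowledge is held in a matrix over the outer product space of the input and output channel vectors, the response is read out with one vector-matrix multiplication, and the same product yields the reward prediction. The class and method names (LinearUnit, respond, update) and the delta-rule update are assumptions made for illustration only.

import numpy as np

class LinearUnit:
    """Sketch of one linear unit: knowledge stored as a reward-prediction
    matrix W over the outer product of input and output channel vectors."""

    def __init__(self, n_in_channels, n_out_channels, lr=0.1):
        self.W = np.zeros((n_out_channels, n_in_channels))  # reward prediction matrix
        self.lr = lr                                         # assumed learning rate

    def respond(self, a):
        """Given an input channel vector a, return an output channel vector
        and the predicted reward for that choice (one matrix-vector product)."""
        q = self.W @ a                  # predicted reward for each output channel
        u = np.zeros_like(q)
        u[np.argmax(q)] = 1.0           # pick the channel with the highest prediction
        return u, q.max()

    def update(self, a, u, reward):
        """Move the prediction for the chosen (input, output) pair toward the
        received reward, using the outer product of the channel vectors."""
        predicted = u @ self.W @ a
        self.W += self.lr * (reward - predicted) * np.outer(u, a)

# Example use (hypothetical channel dimensions):
# unit = LinearUnit(n_in_channels=8, n_out_channels=4)
# u, predicted_reward = unit.respond(np.random.rand(8))

In the paper, many such units are arranged in a dynamic binary tree so that each unit only has to describe the reward function within its own region of the knowledge space; that tree logic is not shown in this sketch.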
Similar Articles
A Dynamic Tree Structure for Incremental Reinforcement Learning of Good Behavior
This paper addresses the idea of learning by reinforcement, within the theory of behaviorism. The reason for this choice is its generality, and especially that the reinforcement learning paradigm allows systems to be designed which can improve their behavior beyond that of their teacher. The role of the teacher is to define the reinforcement function, which acts as a description of the problem t...
Reinforcement Learning Trees
Two new reinforcement learning algorithms are presented. Both use a binary tree to store simple local models in the leaf nodes and coarser global models towards the root. It is demonstrated that a meaningful partitioning into local models can only be accomplished in a fused space consisting of both input and output. The first algorithm uses a batch-like statistical procedure to estimate the reward ...
The learning tree, a new concept in learning
In this paper, learning is considered to be a bootstrapping procedure in which fragmented past experience of what to do when performing well is used to generate new responses, adding more information to the system about the environment. The gained knowledge is represented by a behavior probability density function which is decomposed into a number of normal distributions using a binary tree....
Hierarchical Reinforcement Learning Based Self-balancing Algorithm for Two-wheeled Robots
Abstract: Self-balancing control is the basis for applications of two-wheeled robots. In order to improve the self-balancing of two-wheeled robots, we propose a hierarchical reinforcement learning algorithm for controlling the balance of two-wheeled robots. After describing the subgoals of hierarchical reinforcement learning, we extract features for subgoals, define a feature value vector and it...
Learning in Higher-Order "Artificial Dendritic Trees"
If neurons sum up their inputs in a non-linear way, as some simulations suggest, how is this distributed fine-grained non-linearity exploited during learning? How are all the small sigmoids in synapse, spine and dendritic tree lined up in the right areas of their respective input spaces? In this report, I show how an abstract atemporal highly nested tree structure with a quadratic transfer func...
Journal:
Volume Issue
Pages -
Publication year: 1994